PubChem structure–activity relationship (SAR) clusters
Abstract
BACKGROUND: Developing structure-activity relationships (SARs) of molecules is an important approach in facilitating hit exploration in the early stage of drug discovery. Although information on millions of compounds and their bioactivities is freely available to the public, it is very challenging to infer a meaningful and novel SAR from that information.

RESULTS: Research discussed in the present paper employed a bioactivity-centered clustering approach to group 843,845 non-inactive compounds stored in PubChem according to both structural similarity and bioactivity similarity, with the aim of mining bioactivity data in PubChem for useful SAR information. The compounds were clustered in three bioactivity similarity contexts: (1) non-inactive in a given bioassay, (2) non-inactive against a given protein, and (3) non-inactive against proteins involved in a given pathway. In each context, these small molecules were clustered according to their two-dimensional (2-D) and three-dimensional (3-D) structural similarities. The resulting 18 million clusters, named "PubChem SAR clusters", were delivered in such a way that each cluster contains a group of small molecules similar to each other in both structure and bioactivity.

CONCLUSIONS: The PubChem SAR clusters, pre-computed using publicly available bioactivity information, make it possible to quickly navigate and narrow down the compounds of interest. Each SAR cluster can be a useful resource in developing a meaningful SAR or enable one to design or expand compound libraries from the cluster. It can also help to predict the potential therapeutic effects and pharmacological actions of less-known compounds from those of well-known compounds (i.e., drugs) in the same cluster.
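The two-level grouping described in the abstract (first by bioactivity context, then by structural similarity within each context) can be illustrated with a small sketch. This is not the paper's actual procedure, which the abstract does not specify: the single-linkage grouping, the Tanimoto measure over toy fingerprint bit sets, the 0.5 threshold, and all function and variable names here are assumptions chosen for illustration only.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_by_similarity(fingerprints, threshold):
    """Group compounds by single linkage (an assumed, illustrative method).

    Two compounds end up in the same cluster if they are connected by a
    chain of pairs whose similarity is at least `threshold`.
    """
    parent = {cid: cid for cid in fingerprints}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(fingerprints, 2):
        if tanimoto(fingerprints[a], fingerprints[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for cid in fingerprints:
        groups.setdefault(find(cid), set()).add(cid)
    return sorted(sorted(g) for g in groups.values())


def sar_clusters(non_inactive_by_context, fingerprints, threshold=0.5):
    """Cluster the non-inactive compounds of each bioactivity context."""
    result = {}
    for context, cids in non_inactive_by_context.items():
        subset = {cid: fingerprints[cid] for cid in cids}
        result[context] = cluster_by_similarity(subset, threshold)
    return result


# Hypothetical toy data: compound IDs mapped to fingerprint bit sets.
fingerprints = {
    "c1": {1, 2, 3, 4},
    "c2": {1, 2, 3, 5},
    "c3": {7, 8, 9},
    "c4": {7, 8, 9, 10},
}
# Hypothetical contexts: one bioassay and one protein target.
contexts = {
    "assay_A": ["c1", "c2", "c3"],
    "protein_P": ["c3", "c4"],
}
clusters = sar_clusters(contexts, fingerprints)
```

In this toy setup the same compound (`c3`) can appear in clusters of more than one context, which mirrors the idea that a cluster is defined by a context together with a structural similarity measure. A real analysis would use chemically meaningful 2-D or 3-D similarity scores in place of the toy bit sets.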
Similar resources
Multi-Assay-Based Structure-Activity Relationship Models: Improving Structure-Activity Relationship Models by Incorporating Activity Information from Related Targets
Structure-activity relationship (SAR) models are used to inform and to guide the iterative optimization of chemical leads, and they play a fundamental role in modern drug discovery. In this paper, we present a new class of methods for building SAR models, referred to as multi-assay based, that utilize activity information from different targets. These methods first identify a set of targets tha...
Affinity-based Structure-Activity-Relationship Models: Improving Structure-Activity-Relationship Models by Incorporating Activity Information from Related Targets
Structure-activity-relationship (SAR) models are used to inform and guide the iterative optimization of chemical leads, and play a fundamental role in modern drug discovery. In this paper we present a new class of methods for building SAR models, referred to as affinity-based, that utilize activity information from different targets. These methods first identify a set of targets that are relate...
Mining public-source databases for structure-activity relationships
Modeling off-target effects has become a highly important and relevant component of the computational chemistry toolset. The presentation will describe a new contribution that seeks to allow the extraction of 3D structure-activity relationships from public information starting from a chemical structure. Several public source databases such as PubChem [1] offering structure as well as activity in...
An overview of the PubChem BioAssay resource
The PubChem BioAssay database (http://pubchem.ncbi.nlm.nih.gov) is a public repository for biological activities of small molecules and small interfering RNAs (siRNAs) hosted by the US National Institutes of Health (NIH). It archives experimental descriptions of assays and biological test results and makes the information freely accessible to the public. A PubChem BioAssay data entry includes a...
Inhibition of Protein Kinase C-Driven Nuclear Factor-κB Activation: Synthesis, Structure−Activity Relationship, and Pharmacological Profiling of Pathway Specific Benzimidazole Probe Molecules
A unique series of biologically active chemical probes that selectively inhibit NF-kappaB activation induced by protein kinase C (PKC) pathway activators have been identified through a cell-based phenotypic reporter gene assay. These 2-aminobenzimidazoles represent initial chemical tools to be used in gaining further understanding on the cellular mechanisms driven by B and T cell antigen recept...